Objective

The purpose of this lab is to continue learning a journalistic approach to data analysis in R.

We will continue to do things learned in previous labs:

  • Writing R code for data analysis and exploration in the R Studio environment, using R projects (.Rproj) and R markdown files (.Rmd).
  • Loading, cleaning, making sense of and analyzing data using the Tidyverse framework of packages by selecting certain columns, sorting and filtering
  • Create new columns in our data set based on information in other columns.
  • Summarizing data by grouping and calculating min, max, median and mean values.
  • Store changes on GitHub.
  • Join together two related data sets on a common field.
  • Do some additional data cleaning, including fixing dates so we can work with them.
  • To make visualizations.

Today, we’ll also learn:

  • To do some basic mapping using the GGPlot2 package.

How this works, tasks, turning it in, getting help

This document is mostly set up for you to follow along and run code that I have written, and listen to me explain it.

At several points throughout this document, you will see the word Task.

That indicates I’m expecting you to modify the file I’ve given you, usually by creating a codeblock and writing some custom code.

When you are finished, you should save your R markdown file and Knit it as an HTML file.

You should upload it to GitHub, using GitHub desktop.

And the links to your project is what you’ll post on ELMS.

Need help? You are welcome to do the following things:

Setup

Take the following steps to set up your document:

  1. Download the ZIP file and open the folder on your desktop.
  2. Create a new folder in your git repo and move it in there. Unzip the folder.
  3. Open this file in RStudio.
  4. Rename this file “lab_07_FIRSTNAME_LASTNAME.Rmd”.
  5. Create a new R project inside of this folder, which will set the working directory in this folder.

Load Packages

We’re loading seven packages today. five of these we’ve loaded previously: the Tidyverse (for general data science goodness and visualizing charts and maps), janitor (for data cleaning), arcos (for loading WaPo opioid data) and tidycensus (for loading census data) and scales for cleaning up axis labels and legends.

We’re also going to load two new packages: mapview (for making interactive maps) and ggthemes (for doing cool styling stuff).

Task: In the code block below, load the packages we’ll need for today.

# Load Tidyverse, janitor and arcos, tidycensus, mapview, ggthemes, scales

# install.packages('tidycensus') 
# install.packages('mapview')
# install.packages('ggthemes') 

library(tidyverse)
library(janitor)
library(arcos)
library(scales)
library(tidycensus)
library(mapview)
library(ggthemes)

Using the ARCOS R Package

For this exercise, we will be working with subsets of the DEA’s ARCOS database, which documented shipments of 76 billion opioid pills between 2006 and 2012, during the peak of the opioid epidemic.

The data was obtained after a lengthy legal battle by the Washington Post and the Charleston Gazette-Mail, and released by the Washington Post in raw and aggregated form. Washington Post “Digging into the DEA’s pain pill database” page.

A data dictionary is available here: ARCOS Registrant Handbook.

We’re going to load the data exclusively from the arcos R package API ARCOS API produced by the Washington Post, instead of uploading csvs and tsvs.

Remember, we need to store a password of sorts – called an API key – that will give us permission to access their data. Here’s a list of API keys that will work.

Let’s store the key first.

# store one of our API keys as an object called key

key <- "uO4EK6I"

Load ARCOS data

Our goal today will be to build several maps that show which counties and states received more pills than other parts of the country, adjusted for population. So, to do that, we’ll need to acquire both pill shipment data and population data, then calculate the pills per person statistic.

The ARCOS API has a table with total pills for each county for each year between 2006 and 2012. Let’s pull that down and clean and standardize the column headers with clean_names() . Remember to include the key.

arcos_county_pills_per_year <- summarized_county_annual(key = key) %>%
  clean_names()

The ARCOS API also has a table with county population estimates for each year between 2006 and 2012. Let’s pull that down and clean and standardize the names. Remember to include the key.

arcos_county_population_per_year <- county_population(key = key) %>%
  clean_names()

As we saw in previous labs, there’s an inconsistent number of records between the two tables. This is because there are some counties in the population table that aren’t in the pills table, and vice versa. We can see what these are using the anti_join() function from the Tidyverse. Note: we need to join by the countyfips code and year.

# 999 records in our population table without a countyfips+year match in our pills table
not_in_pills <- arcos_county_population_per_year %>%
  anti_join(arcos_county_pills_per_year, by=c("countyfips","year"))

# 595 records in our pills table without a countyfips+year match in population table.  
not_in_population <- arcos_county_pills_per_year %>%
  anti_join(arcos_county_population_per_year, by=c("countyfips","year"))

Task: Examine the not_in_population table. Write a few sentences explaining patterns you see in which records show up in the pills table, but not in population table. How could this cause problems for you in the future?

# The majority of records that show up in the pills table, but not the population table pertain to U.S. territories. Without an inner join to get rid of these rows -- that is, if you wanted them to be factored in -- data analyses will be complicated by the fact that any entry for the territories will lack information.

Okay, let’s put them together. Remember we need to join them on both countyfips and year. We’re also going to throw in buyer_county and buyer_state, which exist in both tables, so we don’t get the buyer_county.x and buyer_county.y weirdness.

We’re going to inner join here with population on the left side, because we want a consistent set of U.S. counties for when we do mapping later. And we’re going to make a judgment call that it’s okay to exclude pill shipments for which there were no assigned counties, which is what we’re doing when we do the left join.

pills_population <- arcos_county_population_per_year %>%
  left_join(arcos_county_pills_per_year, by = c("countyfips", "year", "buyer_county","buyer_state"))

Now we’re going to clean up our data set a bit by first creating a new column with the state code, by taking the first two digits from our countyfips column, using str_sub() from the stringr package. We’ll need this to map states later on.

pills_population <- pills_population %>%
   mutate(statefips = str_sub(countyfips, 1,2)) %>%
   select(statefips, countyfips, buyer_county, buyer_state,year, population,dosage_unit)

Good. Now we have a nice table we can use to build our maps.

Before we do that, we’ll need to get some shapefiles, or collections of coordinates to draw the map, be it states or counties.

Load Shapefiles

There are lots of places to get shapefiles for all sorts of geographies. The U.S. Census is one of the best! They have a repository called TIGER where you can get a lot. Luckily for us, the tidycensus package allows us to pull in this data directly.

First, we need to store our API Key. You can use mine for now, but best to get your own.

# Define API Key
census_api_key("549950d36c22ff16455fe196bbbd01d63cfbe6cf")

# If you need to look up variables, this is how you do it
acs_variables <- load_variables(2017, "acs5" )

The tidycensus allows us to pull in information from the census by many different geographies – and pull in the “geometry” for the geographic units we define with it.

We’re going to pull in four sets of data here.

First, lets pull in county and state. The “variable” here is just a count of total population, but it doesn’t really matter. We just want the geometry, which we get by selecting geometry=TRUE.

county_geodata <- get_acs(geography = "county",
              variables = "B01001_001", geometry = TRUE)
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |                                                                      |   1%
  |                                                                            
  |=                                                                     |   1%
  |                                                                            
  |=                                                                     |   2%
  |                                                                            
  |==                                                                    |   2%
  |                                                                            
  |==                                                                    |   3%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |===                                                                   |   5%
  |                                                                            
  |====                                                                  |   5%
  |                                                                            
  |====                                                                  |   6%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |=====                                                                 |   8%
  |                                                                            
  |======                                                                |   8%
  |                                                                            
  |======                                                                |   9%
  |                                                                            
  |=======                                                               |   9%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |=======                                                               |  11%
  |                                                                            
  |========                                                              |  11%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |=========                                                             |  12%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |==========                                                            |  15%
  |                                                                            
  |===========                                                           |  15%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |============                                                          |  16%
  |                                                                            
  |============                                                          |  17%
  |                                                                            
  |============                                                          |  18%
  |                                                                            
  |=============                                                         |  18%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |==============                                                        |  19%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |==============                                                        |  21%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |===============                                                       |  22%
  |                                                                            
  |================                                                      |  22%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |=================                                                     |  24%
  |                                                                            
  |=================                                                     |  25%
  |                                                                            
  |==================                                                    |  25%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |===================                                                   |  27%
  |                                                                            
  |===================                                                   |  28%
  |                                                                            
  |====================                                                  |  28%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |=====================                                                 |  29%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |=====================                                                 |  31%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |======================                                                |  32%
  |                                                                            
  |=======================                                               |  32%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |========================                                              |  34%
  |                                                                            
  |========================                                              |  35%
  |                                                                            
  |=========================                                             |  35%
  |                                                                            
  |=========================                                             |  36%
  |                                                                            
  |==========================                                            |  37%
  |                                                                            
  |==========================                                            |  38%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |===========================                                           |  39%
  |                                                                            
  |============================                                          |  39%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |============================                                          |  41%
  |                                                                            
  |=============================                                         |  41%
  |                                                                            
  |=============================                                         |  42%
  |                                                                            
  |==============================                                        |  42%
  |                                                                            
  |==============================                                        |  43%
  |                                                                            
  |===============================                                       |  44%
  |                                                                            
  |===============================                                       |  45%
  |                                                                            
  |================================                                      |  45%
  |                                                                            
  |================================                                      |  46%
  |                                                                            
  |=================================                                     |  46%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |=================================                                     |  48%
  |                                                                            
  |==================================                                    |  48%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |===================================                                   |  49%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |====================================                                  |  52%
  |                                                                            
  |=====================================                                 |  52%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |======================================                                |  54%
  |                                                                            
  |======================================                                |  55%
  |                                                                            
  |=======================================                               |  55%
  |                                                                            
  |=======================================                               |  56%
  |                                                                            
  |========================================                              |  56%
  |                                                                            
  |========================================                              |  57%
  |                                                                            
  |========================================                              |  58%
  |                                                                            
  |=========================================                             |  58%
  |                                                                            
  |=========================================                             |  59%
  |                                                                            
  |==========================================                            |  59%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |==========================================                            |  61%
  |                                                                            
  |===========================================                           |  61%
  |                                                                            
  |===========================================                           |  62%
  |                                                                            
  |============================================                          |  62%
  |                                                                            
  |============================================                          |  63%
  |                                                                            
  |=============================================                         |  64%
  |                                                                            
  |=============================================                         |  65%
  |                                                                            
  |==============================================                        |  65%
  |                                                                            
  |==============================================                        |  66%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |===============================================                       |  68%
  |                                                                            
  |================================================                      |  68%
  |                                                                            
  |================================================                      |  69%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |=================================================                     |  71%
  |                                                                            
  |==================================================                    |  71%
  |                                                                            
  |==================================================                    |  72%
  |                                                                            
  |===================================================                   |  72%
  |                                                                            
  |===================================================                   |  73%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |====================================================                  |  75%
  |                                                                            
  |=====================================================                 |  75%
  |                                                                            
  |=====================================================                 |  76%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |======================================================                |  78%
  |                                                                            
  |=======================================================               |  78%
  |                                                                            
  |=======================================================               |  79%
  |                                                                            
  |========================================================              |  79%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |========================================================              |  81%
  |                                                                            
  |=========================================================             |  81%
  |                                                                            
  |=========================================================             |  82%
  |                                                                            
  |==========================================================            |  82%
  |                                                                            
  |==========================================================            |  83%
  |                                                                            
  |===========================================================           |  84%
  |                                                                            
  |===========================================================           |  85%
  |                                                                            
  |============================================================          |  85%
  |                                                                            
  |============================================================          |  86%
  |                                                                            
  |=============================================================         |  87%
  |                                                                            
  |=============================================================         |  88%
  |                                                                            
  |==============================================================        |  88%
  |                                                                            
  |==============================================================        |  89%
  |                                                                            
  |===============================================================       |  89%
  |                                                                            
  |===============================================================       |  90%
  |                                                                            
  |===============================================================       |  91%
  |                                                                            
  |================================================================      |  91%
  |                                                                            
  |================================================================      |  92%
  |                                                                            
  |=================================================================     |  92%
  |                                                                            
  |=================================================================     |  93%
  |                                                                            
  |=================================================================     |  94%
  |                                                                            
  |==================================================================    |  94%
  |                                                                            
  |==================================================================    |  95%
  |                                                                            
  |===================================================================   |  95%
  |                                                                            
  |===================================================================   |  96%
  |                                                                            
  |====================================================================  |  97%
  |                                                                            
  |====================================================================  |  98%
  |                                                                            
  |===================================================================== |  98%
  |                                                                            
  |===================================================================== |  99%
  |                                                                            
  |======================================================================| 100%
state_geodata <- get_acs(geography = "state",
              variables = "B01001_001", geometry = TRUE)
## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=                                                                     |   1%
  |                                                                            
  |=                                                                     |   2%
  |                                                                            
  |==                                                                    |   3%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |===                                                                   |   5%
  |                                                                            
  |====                                                                  |   5%
  |                                                                            
  |====                                                                  |   6%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |=====                                                                 |   8%
  |                                                                            
  |======                                                                |   9%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |========                                                              |  11%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |=========                                                             |  12%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |=========                                                             |  14%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |==========                                                            |  15%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |============                                                          |  17%
  |                                                                            
  |============                                                          |  18%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |==============                                                        |  21%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |===============                                                       |  22%
  |                                                                            
  |================                                                      |  22%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |=================                                                     |  24%
  |                                                                            
  |=================                                                     |  25%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |===================                                                   |  27%
  |                                                                            
  |===================                                                   |  28%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |======================                                                |  32%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |=======================                                               |  34%
  |                                                                            
  |========================                                              |  35%
  |                                                                            
  |=========================                                             |  36%
  |                                                                            
  |==========================                                            |  37%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |===========================                                           |  39%
  |                                                                            
  |============================                                          |  39%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |============================                                          |  41%
  |                                                                            
  |=============================                                         |  41%
  |                                                                            
  |==============================                                        |  43%
  |                                                                            
  |===============================                                       |  44%
  |                                                                            
  |===============================                                       |  45%
  |                                                                            
  |================================                                      |  45%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |==================================                                    |  48%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |===================================                                   |  49%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |===================================                                   |  51%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |======================================                                |  54%
  |                                                                            
  |======================================                                |  55%
  |                                                                            
  |=======================================                               |  55%
  |                                                                            
  |========================================                              |  57%
  |                                                                            
  |=========================================                             |  58%
  |                                                                            
  |=========================================                             |  59%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |===========================================                           |  61%
  |                                                                            
  |============================================                          |  62%
  |                                                                            
  |============================================                          |  63%
  |                                                                            
  |=============================================                         |  64%
  |                                                                            
  |==============================================                        |  65%
  |                                                                            
  |==============================================                        |  66%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |================================================                      |  68%
  |                                                                            
  |================================================                      |  69%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |==================================================                    |  71%
  |                                                                            
  |==================================================                    |  72%
  |                                                                            
  |===================================================                   |  72%
  |                                                                            
  |===================================================                   |  73%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |=====================================================                 |  75%
  |                                                                            
  |=====================================================                 |  76%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |=======================================================               |  78%
  |                                                                            
  |=======================================================               |  79%
  |                                                                            
  |========================================================              |  79%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |=========================================================             |  81%
  |                                                                            
  |==========================================================            |  82%
  |                                                                            
  |==========================================================            |  83%
  |                                                                            
  |===========================================================           |  84%
  |                                                                            
  |===========================================================           |  85%
  |                                                                            
  |============================================================          |  85%
  |                                                                            
  |============================================================          |  86%
  |                                                                            
  |=============================================================         |  87%
  |                                                                            
  |==============================================================        |  88%
  |                                                                            
  |==============================================================        |  89%
  |                                                                            
  |===============================================================       |  89%
  |                                                                            
  |===============================================================       |  90%
  |                                                                            
  |================================================================      |  91%
  |                                                                            
  |================================================================      |  92%
  |                                                                            
  |=================================================================     |  93%
  |                                                                            
  |==================================================================    |  94%
  |                                                                            
  |==================================================================    |  95%
  |                                                                            
  |===================================================================   |  96%
  |                                                                            
  |====================================================================  |  97%
  |                                                                            
  |===================================================================== |  98%
  |                                                                            
  |===================================================================== |  99%
  |                                                                            
  |======================================================================| 100%

Let’s take a look at these files. Warning, if you’re working on a slow machine, don’t view it, as it may crash your R instance. Just watch the video. The last column has geometry information, a collection of lat and long points that form a polygon shape.

We’re also going to pull down “shifted” geometry, because it will help us make better maps. Instead of putting Hawaii and Alaska in their usual spots on the globe – which makes the map unncessarily large – it moves them into a spot right near the continental U.S. You’ve seen maps like this. I’ll show you an example in a second.

county_geodata_shifted <- get_acs(geography = "county",
              variables = "B01001_001", geometry = TRUE, shift_geo = TRUE)

state_geodata_shifted <- get_acs(geography = "state",
              variables = "B01001_001", geometry = TRUE, shift_geo = TRUE)

Look at our shapefiles

Let’s take a look at the columns in our shapefiles. We can View, but if you’re computer is too slow to View, run the code below. Unfortunately, we can’t use glimpse() or summary() on the file now that it contains geo information, for some reason.

print(head(state_geodata_shifted))
## Simple feature collection with 6 features and 5 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -2031905 ymin: -1470717 xmax: 2295505 ymax: 67481.2
## epsg (SRID):    NA
## proj4string:    +proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs
##   GEOID                 NAME   variable estimate moe
## 1    04              Arizona B01001_001  6946685  NA
## 2    05             Arkansas B01001_001  2990671  NA
## 3    06           California B01001_001 39148760  NA
## 4    08             Colorado B01001_001  5531141  NA
## 5    09          Connecticut B01001_001  3581504  NA
## 6    11 District of Columbia B01001_001   684498  NA
##                         geometry
## 1 MULTIPOLYGON (((-1111066 -8...
## 2 MULTIPOLYGON (((557903.1 -1...
## 3 MULTIPOLYGON (((-1853480 -9...
## 4 MULTIPOLYGON (((-613452.9 -...
## 5 MULTIPOLYGON (((2226838 519...
## 6 MULTIPOLYGON (((1960720 -41...

Build a simple map just from shapefiles

Let’s make a simple map just from our population shapefiles to see how ggplot2 treats geospatial data.

Three lines of code is all we need. First, the dataframe we’re doing this to, which contains our spatial information. Then we tell it to make a ggplot graphic, using our population information (the variable is called “estimate” in our table) to fill in the shapes with a color gradient. Then geom_sf() says “take the information in the geometry column” and draw a map!

state_geodata_shifted %>%
  ggplot(aes(fill = estimate)) +
  geom_sf()

# Turn off scientific notation if your legend looks weird
options(scipen=999)

Nice! Okay, just like with ggplot2 in the last lab, we can keep adding stuff to our function to fix the styling. I don’t know about you, but those geographic coordinates are useless information, so we can get rid of those. Fortunately, the theme_map() function that’s part of ggthemes package will take care of it.

state_geodata_shifted %>%
  ggplot(aes(fill = estimate)) +
  geom_sf() +
  theme_map()

I don’t love those state line borders. Let’s set the width (lwd) to 0.

state_geodata_shifted %>%
  ggplot(aes(fill = estimate)) +
  geom_sf(lwd = 0) +
  theme_map()

Now, let’s clean up the legend title, put commas in the legend labels, and move the legend to the right.

state_geodata_shifted %>%
  ggplot(aes(fill = estimate)) +
  geom_sf(lwd = 0) +
  theme_map() +
  labs(fill='Population') +
  scale_fill_continuous(labels = comma) +
  theme(legend.position="right")

Let’s give it a title, subtitle and source.

state_geodata_shifted %>%
  ggplot(aes(fill = estimate)) +
  geom_sf(lwd = 0) +
  theme_map() +
  labs(fill='Population', title="California, Texas have largest populations", subtitle = "2017 population, U.S. Census", caption = "Source: U.S. Census ACS") +
  scale_fill_continuous(labels = comma) +
  theme(legend.position="right")

And lastly, let’s give it a cooler color scheme, viridis.

state_geodata_shifted %>%
  ggplot(aes(fill = estimate)) +
  geom_sf(lwd = 0) +
  theme_map() +
  labs(fill='Population',title="California, Texas have largest populations", subtitle = "2017 population, U.S. Census", caption = "Source: U.S. Census ACS") +
  theme(legend.position="right") +
  scale_fill_viridis_c(option = "magma",labels = comma)

# Also try "plasma", "viridis", "cividis"

Note: we’re just scratching the surface here of all the options for mapping in ggplot2. Link. And there are other mapping libraries we can use.

Simple Interactive Map

Before we move on, I want to show you how we can create a simple interactive map we can use to further explore the data with the mapview library.

After we’ve loaded the mapview package, the function we use is called “mapview”. We feed it the name of our data set, the name of the variable we want to visualize and indicate if we want a legend. That’s pretty much it.

mapview(state_geodata_shifted, zcol = "estimate", legend = TRUE)

Pills per person state map

Okay, so now our goal is to look for regional trends in opioid shipment rates by building a state map that shows pills per person shipped per year.

Remember our pills_population table we made above? Let’s make a state table looking at state totals in 2009.

First, we filter just for 2009. Then we group by the state fips code and buyer state. Then we add up all of the pills shipped to that state in 2009 and the population for that state in 2009, creating two new columns.

pills_population_state_2009 <- pills_population %>%
  filter(year==2009) %>%
  group_by(statefips, buyer_state) %>%
  summarise(total_pills=sum(dosage_unit, na.rm = TRUE),
            total_population=sum(population, na.rm=TRUE)) %>%
  mutate(pills_per_person=total_pills/total_population)

Now, let’s join it to our shapefile data for our states, using the GEOID from our shapefile and statefips, the column we created with the state code.

geo_pills_population_state_2009 <- state_geodata_shifted %>%
 inner_join(pills_population_state_2009, by=c("GEOID" = "statefips"))

Okay, first, let’s build an interactive version of the map.

Task: Using the mapview() function we used above, show an interactive map that shades each state according to the pills per person rate in 2009.

# interactive map code here

mapview(geo_pills_population_state_2009, zcol = "pills_per_person", legend = TRUE)

Now let’s plot a basic version.

geo_pills_population_state_2009 %>%
  ggplot(aes(fill = pills_per_person)) +
  geom_sf()

Task: stylize the map, making it look like the final population map we created in the first part of the exercise with two changes: center the legend on the bottom of the graphic, instead of at right. You might try searching for legend.position on Google. And write up a headline, subtitle and source appropriate for the chart.

# Map code here

geo_pills_population_state_2009 %>%
  ggplot(aes(fill = pills_per_person)) +
  geom_sf(lwd = 0) +
  theme_map() +
  labs(fill='Pills per Person',title="West Virginia had the most pills per person in 2009", subtitle = "High ratios also found in KY and TN, among other states", caption = "Source: DEA ARCOS database, via Washington Post") +
  theme(legend.position="bottom") +
  scale_fill_viridis_c(option = "magma",labels = comma)

Facets

Let’s say we want to create a single map for each year between 2006 and 2012, to see how the rate of pills per person changed over time in each state.

First, we’ll group by statefips, buyer state and year. Then, we’ll calculate total pills and total population for each state for each year, before piping those columns into our pills per person calculation.

That will give us one pills_per_person rate per state per year.

pills_population_state <- pills_population %>%
  group_by(statefips, buyer_state, year) %>%
  summarise(total_pills=sum(dosage_unit, na.rm = TRUE),
            total_population=sum(population, na.rm=TRUE)) %>%
  mutate(pills_per_person=total_pills/total_population)

Now, let’s join it to our shapefile data for our states, using the GEOID from our shapefile and statefips, the column we created with the state code.

geo_pills_population_state <- state_geodata_shifted %>%
 inner_join(pills_population_state, by=c("GEOID" = "statefips"))

And now, let’s plot it. This is the same as above, but I’ve added one line: facet_wrap(~year, nrow=2).

This says: make one chart for each year, using the year column and organize the grid into two rows.

geo_pills_population_state %>%
  ggplot(aes(fill = pills_per_person)) +
  facet_wrap(~year, nrow=2) +
  geom_sf(lwd = 0) +
  theme_map() +
  labs(fill='Pills per Person',title="", subtitle = "", caption = "Source: ARCOS database") +
  theme(legend.position="bottom") +
  scale_fill_viridis_c(option = "magma",labels = comma)

Task: copy the code above and paste it below. Modify the title and subtitle to describe the trend you identify with the map. What happens between 2006 and 2012?

# Map code here
geo_pills_population_state %>%
  ggplot(aes(fill = pills_per_person)) +
  facet_wrap(~year, nrow=2) +
  geom_sf(lwd = 0) +
  theme_map() +
  labs(fill='Pills per Person',title="Pills per person increase from 2006 to 2012 seen in most states nationwide", subtitle = "WV, KY among those hit hardest by opioid epidemic", caption = "Source: ARCOS database") +
  theme(legend.position="bottom") +
  scale_fill_viridis_c(option = "magma",labels = comma)

Task: Make a single INTERACTIVE map that is colored based on the number of pills per person in each county in 2009. Provide a short writeup on what you find.What trends do you see, beyond Appalachia? Note: the map will probably open in a web browser.

pills_population_county_2009 <- pills_population %>%
  filter(year==2009) %>%
  group_by(countyfips, buyer_county) %>%
  summarise(total_pills=sum(dosage_unit, na.rm = TRUE),
            total_population=sum(population, na.rm=TRUE)) %>%
  mutate(pills_per_person=total_pills/total_population)
geo_pills_population_county_2009 <- county_geodata_shifted %>%
 inner_join(pills_population_county_2009, by=c("GEOID" = "countyfips"))
# Map Code Here

mapview(geo_pills_population_county_2009, zcol = "pills_per_person", legend = TRUE)
# Beyond Appalachia, pills per person figures seem higher in the West.

Task: Make a static facet map, with one map per year between 2006 and 2012, showing only Maryland counties, colored based on the number of pills per person in each county in each year. Provide a short writeup on what you find. What trends do you see?

# map code here

pills_population_county_maryland <- pills_population %>%
  filter(statefips==24 & buyer_state=="MD" & year %in% c(2006:2012)) %>%
  group_by(countyfips, buyer_county, buyer_state, year) %>%
  summarise(total_pills=sum(dosage_unit, na.rm = TRUE),
            total_population=sum(population, na.rm=TRUE)) %>%
  mutate(pills_per_person=total_pills/total_population)
geo_pills_population_county_maryland <- county_geodata_shifted %>%
 inner_join(pills_population_county_maryland, by=c("GEOID" = "countyfips"))
geo_pills_population_county_maryland %>%
  ggplot(aes(fill = pills_per_person)) +
  facet_wrap(~year, nrow=2) +
  geom_sf(lwd = 0) +
  theme_map() +
  labs(fill='Pills per Person',title="", subtitle = "", caption = "Source: ARCOS database") +
  theme(legend.position="bottom") +
  scale_fill_viridis_c(option = "magma",labels = comma)

# In most Maryland counties, pills per person figures increase from 2006 to 2010-2011, then decline in 2012.

Submission

Save the R Markdown file. Knit it to HTML and make sure it compiles correctly. Upload to GitHub, as instructed. Provide links to GitHub in ELMS.